QSAR Study on a Series of Protein Tyrosine Phosphatase 1B Inhibitors.

As a therapeutic target, protein tyrosine phosphatase 1B (PTP1B) has received considerable attention for the treatment of diabetes mellitus. A QSAR study was carried out on substituted monocyclic and polycyclic thiophene derivatives recently reported as potent PTP1B inhibitors. More than 60 physicochemical descriptors were calculated and subjected to rational selection before being used to derive QSAR models. Statistically significant equations were generated using multiple linear regression analysis. External validation of the derived models with test set compounds showed good predictive ability. Interpretation of the results revealed lipophilicity as a key feature affecting PTP1B inhibition, along with several electronic and steric parameters. The study provides a platform upon which novel, rationally designed molecules can be synthesized with cautious optimism.


INTRODUCTION
Type-2 diabetes mellitus is the only non-infectious disease recognized as an epidemic by the WHO because of its worldwide spread, especially in countries with a western lifestyle (1). It is characterized by chronically elevated blood glucose levels. The increased incidence of type-2 diabetes mellitus and obesity in the population has fueled an intense search for new therapeutic options (2). Resistance to the hormone insulin in muscle, liver and the central nervous system (3,4) is characteristic of both type-2 diabetes (1) and obesity (5). Drugs that can ameliorate this resistance should be effective in treating these diseases. Protein tyrosine phosphatases (PTPases) catalyze the removal of phosphate groups from phosphotyrosyl residues in many proteins.
PTP1B was the first PTP to be purified (6) and has been demonstrated to dephosphorylate the insulin receptor and thereby attenuate its tyrosine kinase activity, thus acting as a negative regulator of insulin signaling (7,8). A recent study using PTP1B knockout mice showed increased insulin receptor phosphorylation and enhanced sensitivity to insulin in skeletal muscle and liver. In addition, PTP1B knockout mice have remarkably low adiposity and are protected from diet-induced obesity (9,10). Most importantly, these mice appeared to be normal and healthy, which suggests that specific inhibitors could be largely free of side effects and have selective therapeutic efficacy. Thus, PTP1B has become an attractive therapeutic target for the treatment of type-2 diabetes and obesity. A number of research groups have developed small molecules targeting this enzyme (11)(12)(13)(14)(15)(16).

Quantitative structure-activity relationship (QSAR) analysis plays an important role in lead structure optimization. QSAR methods attempt to capture the relationship between the structural features of molecules and their biological activities. Among the several PTP1B inhibitors described in the literature, mono-, bi- and tricyclic thiophene derivatives are interesting small molecules for drug design because of their synthetic accessibility and high potency (17)(18)(19). In the present study, a data set of 33 compounds reported as PTP1B inhibitors was used to develop 2D QSAR models, with the aim of using the derived information to help optimize the lead.

Data set
All calculations were carried out on a Windows-based PC workstation using the ChemOffice software package. A series of 33 compounds reported by Wan Zhao-Kui et al. (17)(18)(19) was used for the present QSAR study. The chemical structures and biological properties of the complete set of compounds are listed in Tables 1-4. The data set was divided into a training set of 27 compounds (compounds 1-27) and a test set of 6 compounds (compounds 28-33).
The Ki values employed in this work (ranging from 0.2 to 160 µM) were measured under the same experimental conditions and are acceptably distributed across the range of values. Thus, the data set is appropriate for QSAR model development. The Ki values were transformed to pKi (-log Ki, with Ki expressed in mol/L) before being used as dependent variables in the QSAR investigations.
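The pKi transformation can be illustrated with a minimal sketch. It assumes Ki is supplied in micromolar and converted to mol/L before taking the negative base-10 logarithm; the function name ki_um_to_pki is hypothetical and is not part of any software used in the study.

```python
import math


def ki_um_to_pki(ki_um: float) -> float:
    """Convert Ki given in micromolar to pKi = -log10(Ki in mol/L).

    Hypothetical helper for illustration only.
    """
    if ki_um <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_um * 1e-6)


if __name__ == "__main__":
    # The two ends of the Ki range quoted in the text (0.2 and 160 uM).
    for ki in (0.2, 160.0):
        print(f"Ki = {ki} uM -> pKi = {ki_um_to_pki(ki):.2f}")
```

Under this convention the weakest inhibitor in the series (Ki = 160 µM) maps to a pKi of about 3.8, so larger pKi values correspond to more potent compounds.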

Descriptor calculation and selection
The Dragon 5.4, ChemOffice and TSAR 3.3 software packages were used to generate the physicochemical descriptors for the QSAR studies. Descriptors were obtained for the whole molecules. This procedure afforded 65 descriptors, which were subjected to the following selection strategy.
The descriptor pool was reduced by discarding highly inter-correlated descriptors (r > 0.8) and by retaining descriptors that appeared with higher frequency in previous models. After that, the descriptors with r > 0.9 between the activity and the descriptor were introduced to build the QSAR model. Descriptors with constant values, those poorly correlated with biological activity (r² < 0.10), and those more than 0.99 correlated with one another were also discarded.
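A filtering step of this kind can be sketched as follows. This is an illustrative assumption, not the procedure implemented in Dragon, ChemOffice or TSAR: the function select_descriptors, the greedy keep-the-better-correlated rule, and the toy descriptor values are all hypothetical. Only the thresholds (r² < 0.10 with activity, r > 0.8 between descriptors) come from the text.

```python
import numpy as np


def select_descriptors(X, y, names, min_r2=0.10, max_inter_r=0.8):
    """Return the names of descriptors that survive three filters.

    1. Drop constant descriptors.
    2. Drop descriptors whose r^2 with activity is below min_r2.
    3. Of any pair with |r| > max_inter_r, keep only the descriptor that is
       better correlated with activity (greedy choice, an assumption here).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Filter 1: constant columns carry no information.
    cols = [j for j in range(X.shape[1]) if X[:, j].max() > X[:, j].min()]

    # Filter 2: squared Pearson correlation with the activity.
    r2 = {j: np.corrcoef(X[:, j], y)[0, 1] ** 2 for j in cols}
    cols = [j for j in cols if r2[j] >= min_r2]

    # Filter 3: walk from best to worst and skip inter-correlated descriptors.
    cols.sort(key=lambda j: r2[j], reverse=True)
    selected = []
    for j in cols:
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) <= max_inter_r
               for k in selected):
            selected.append(j)
    return [names[j] for j in selected]


if __name__ == "__main__":
    # Toy values, NOT data from the study.
    names = ["logP", "MW", "const", "noise", "polar"]
    X = np.array([
        [1, 2, 1, 1, 1],
        [2, 4, 1, -1, 3],
        [3, 6, 1, 1, 2],
        [4, 8, 1, -1, 2],
        [5, 10, 1, 1, 3],
        [6, 13, 1, -1, 4],
    ])
    y = [1, 2, 3, 4, 5, 6]
    print(select_descriptors(X, y, names))
```

Keeping the descriptor that correlates best with activity is only one way to break ties between inter-correlated descriptors; frequency of appearance in earlier models, as mentioned above, would be another.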

QSAR model development
The TSAR software was employed to systematically search for multiple linear regression (MLR) models of up to five variables. The statistical measures used in the stepwise multiple regression analysis are: n, the number of compounds in the regression; r, the correlation coefficient; r², the squared correlation coefficient; s, the standard error of estimate; and F, the Fisher F-test value for statistical significance. Values of all the descriptors used in the study are given in Table 5.
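The statistics listed above can be computed for any MLR model with a short sketch such as the following. It uses ordinary least squares from NumPy rather than TSAR, and the function name mlr_statistics and the toy numbers are assumptions made for illustration only.

```python
import math

import numpy as np


def mlr_statistics(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk and return n, r, r2, s and F."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, k = X.shape
    dof = n - k - 1  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more compounds than fitted parameters")

    A = np.column_stack([np.ones(n), X])  # intercept column first
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(np.sum((y - y.mean()) ** 2))

    r2 = max(0.0, 1.0 - ss_res / ss_tot)
    s = math.sqrt(ss_res / dof)  # standard error of estimate
    # F is undefined for a perfect fit; report infinity in that case.
    f_value = math.inf if ss_res == 0 else (r2 / k) / ((1.0 - r2) / dof)

    return {
        "n": n,
        "intercept": float(coef[0]),
        "coefficients": [float(c) for c in coef[1:]],
        "r": math.sqrt(r2),
        "r2": r2,
        "s": s,
        "F": f_value,
    }


if __name__ == "__main__":
    # Toy single-descriptor example, NOT data from the study.
    stats = mlr_statistics([0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 6.8, 9.0])
    for key, value in stats.items():
        print(key, value)
```

A model is usually regarded as statistically sound when s is low and r² and F are high, which is the criterion applied to the equations discussed later.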

RESULTS
An important step in classical QSAR modeling is the selection of appropriate descriptors that are correlated with biological activity. Because of the large number of descriptors available, they were selected on the basis of their correlation with biological activity and their capability of producing MLR models with up to four descriptors and r² > 0.8. This strategy had two goals: to build initial QSAR models that could shed light on structural features important for PTP1B binding, and to select a subset of the most correlated descriptors that could be further explored in QSAR model development. QSAR analysis of the 17 selected descriptors generated many equations. Those that were statistically significant are shown in Table 6 along with their statistical parameters. The predictive power of the best QSAR model derived from the 27 training set molecules was assessed by predicting Ki values for the 6 test set compounds (28-33), which were not used for QSAR model development.
The external validation process can be considered the most reliable validation method, as cross-validation procedures may lead to very optimistic statistics (20).
The results of the external validation are listed in Table 7, and the experimental versus predicted activities of both the training set and the test set are plotted in Figure 1.
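External validation of this kind can be summarized numerically with a minimal sketch. The residuals are the quantities reported for the test set; the root-mean-square error and the predictive r² (one minus the sum of squared prediction errors divided by the squared deviations of the test activities from the training-set mean) are commonly used additions and are assumptions here, since the text does not state that they were calculated. The function name and the toy values are hypothetical.

```python
import math


def external_validation(observed, predicted, train_mean):
    """Summarize test-set predictions of pKi.

    observed, predicted: pKi values for the test compounds.
    train_mean: mean observed pKi of the training set.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    residuals = [o - p for o, p in zip(observed, predicted)]
    press = sum(r * r for r in residuals)  # sum of squared prediction errors
    ss_dev = sum((o - train_mean) ** 2 for o in observed)
    if ss_dev == 0:
        raise ValueError("test activities all equal the training mean")

    return {
        "residuals": residuals,
        "rmse": math.sqrt(press / len(observed)),
        "r2_pred": 1.0 - press / ss_dev,
    }


if __name__ == "__main__":
    # Toy values, NOT results from the study.
    summary = external_validation([5.0, 6.0, 4.0], [5.1, 5.8, 4.1], train_mean=5.0)
    print(summary)
```

Small residuals together with a predictive r² close to 1 would indicate that a model generalizes to compounds outside the training set.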

DISCUSSION
The very low residuals obtained for the test set indicate good predictive ability of the models. The other eleven models (equations 2-12) were also found to possess good predictive power when subjected to external validation using the test set compounds (data not shown). Good predictability and statistical significance (low s; high r² and F values) were characteristic features of the developed models, satisfying the primary requirements for a QSAR study.
Preliminary structure-activity relationship studies revealed that monocyclic substituted thiophenes (compounds 1-11) were more potent than fused thiophenes (compounds 12-17). Also, an amide group or a secondary amino group increased activity significantly (as seen with compounds 7-11, 21-26 and 33) owing to its involvement in an H-bond interaction with the Asp48 residue of the enzyme (19). Lipophilicity turned out to be an indispensable feature for PTP1B inhibition, which is consistent with the well-established interaction of ligands with the hydrophobic Met258 side chain (17). In all the QSAR models developed, logP contributed significantly to the biological activity. This is consistent with the observation that the most potent compounds (10 and 11) had the highest logP values (4.6 and 5.92, respectively). Additionally, compound 5, with a Ki value of 160 µM (pKi = 3.8), had the lowest logP value, 1.54. Thus, the design of lipophilic compounds should be the strategy for future PTP1B inhibitors in this series. Various electronic parameters also contributed significantly to the biological activity. Polarizability appeared to affect biological activity positively, whereas a negative effect of molar refractivity was observed. The effect of steric descriptors remained variable. Nevertheless, valence connectivity indices, molecular weight, parachor and the Randić topological index were found to be very useful for generating statistically significant and predictive QSAR models.

CONCLUSION
Twelve predictive QSAR equations were derived by multiple linear regression analysis with significant descriptors, the most important being lipophilicity, electronic and steric parameters. Excellent correlation with lipophilicity was observed, suggesting the future design of additional lipophilic compounds. Little general information was obtained about the effect of electronic descriptors on biological activity because of their variable effects in different models. As the models were predictive in nature, the effect of electronic descriptors can be taken into account quite satisfactorily when each model is considered individually. The models also gave insight into the incorporation of groups based on their electronic nature, i.e. electron release or withdrawal. Similarly, steric features were also well correlated with biological activity.
Although it seemed difficult to judge their absolute effect, good predictive models were derived when steric features were taken along with different sets of lipophilic and electronic parameters.

ACKNOWLEDGEMENT
PSB thanks the Council of Scientific and Industrial Research (CSIR) for financial support through the award of a Senior Research Fellowship.

Figure 1. Predicted pKi values versus observed pKi values for the training set and test set compounds.